{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TAO Image Classification (TF2)\n",
    "\n",
    "Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. \n",
    "\n",
    "Train Adapt Optimize (TAO) Toolkit  is a simple and easy-to-use Python based AI toolkit for taking purpose-built AI models and customizing them with users' own data.\n",
    "\n",
    "<img align=\"center\" src=\"https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png\" width=\"1080\"> "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Learning Objectives\n",
    "In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:\n",
    "\n",
    "* Take a pretrained resnet18 model and finetune on a sample dataset converted from PascalVOC\n",
    "* Prune the finetuned model\n",
    "* Retrain the pruned model to recover lost accuracy\n",
    "* Export the pruned model\n",
    "* Run Inference on the trained model\n",
    "* Export the pruned and retrained model to a .onnx file for deployment to DeepStream\n",
    "\n",
    "At the end of this notebook, you will have generated a trained and optimized `classification` model\n",
    "which you may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps)\n",
    "or [DeepStream](https://developer.nvidia.com/deepstream-sdk).\n",
    "\n",
    "### Table of Contents\n",
    "This notebook shows an example use case for classification using the Train Adapt Optimize (TAO) Toolkit.\n",
    "\n",
    "0. [Set up env variables and map drives](#head-0)\n",
    "1. [Installing the TAO Launcher](#head-1)\n",
    "2. [Prepare dataset and pretrained model](#head-2)\n",
    "    1. [Split the dataset into train/test/val](#head-2-1)\n",
    "    2. [Download pre-trained model](#head-2-2)\n",
    "3. [Provide training specification](#head-3)\n",
    "4. [Run TAO training](#head-4)\n",
    "5. [Evaluate trained models](#head-5)\n",
    "6. [Prune trained models](#head-6)\n",
    "7. [Retrain pruned models](#head-7)\n",
    "8. [Testing the model](#head-8)\n",
    "9. [Visualize inferences](#head-9)\n",
    "10. [Export and Deploy!](#head-10)\n",
    "11. [Verify the deployed model](#head-11)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Set up env variables and map drives <a class=\"anchor\" id=\"head-0\"></a>\n",
    "\n",
    "The following notebook requires the user to set an env variable called the `$LOCAL_PROJECT_DIR` as the path to the users workspace. Please note that the dataset to run this notebook is expected to reside in the `$LOCAL_PROJECT_DIR/data`, while the TAO experiment generated collaterals will be output to `$LOCAL_PROJECT_DIR/classification_tf2`. More information on how to set up the dataset and the supported steps in the TAO workflow are provided in the subsequent cells.\n",
    "\n",
    "*Note: Please make sure to remove any stray artifacts/files from the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths as mentioned below, that may have been generated from previous experiments. Having checkpoint files etc may interfere with creating a training graph for a new experiment.*\n",
    "\n",
    "*Note: This notebook currently is by default set up to run training using 1 GPU. To use more GPU's please update the env variable `$NUM_GPUS` accordingly*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Setting up env variables for cleaner command line commands.\n",
    "import os\n",
    "\n",
    "%env NUM_GPUS=1\n",
    "%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/classification_tf2\n",
    "%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data\n",
    "\n",
    "# Set this path if you don't run the notebook from the samples directory.\n",
    "# %env NOTEBOOK_ROOT=~/tao-samples/classification_tf2\n",
    "\n",
    "# Please define this local project directory that needs to be mapped to the TAO docker session.\n",
    "# The dataset expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps\n",
    "# in this notebook will be stored at $LOCAL_PROJECT_DIR/classification_tf2\n",
    "# !PLEASE MAKE SURE TO UPDATE THIS PATH!.\n",
    "os.environ[\"LOCAL_PROJECT_DIR\"] = FIXME\n",
    "\n",
    "os.environ[\"LOCAL_DATA_DIR\"] = os.path.join(\n",
    "    os.getenv(\"LOCAL_PROJECT_DIR\", os.getcwd()),\n",
    "    \"data\"\n",
    ")\n",
    "os.environ[\"LOCAL_EXPERIMENT_DIR\"] = os.path.join(\n",
    "    os.getenv(\"LOCAL_PROJECT_DIR\", os.getcwd()),\n",
    "    \"classification_tf2\"\n",
    ")\n",
    "\n",
    "# The sample spec files are present in the same path as the downloaded samples.\n",
    "os.environ[\"LOCAL_SPECS_DIR\"] = os.path.join(\n",
    "    os.getenv(\"NOTEBOOK_ROOT\", os.getcwd()),\n",
    "    \"specs\"\n",
    ")\n",
    "%env SPECS_DIR=/workspace/tao-experiments/classification_tf2/tao_voc/specs\n",
    "\n",
    "# Showing list of specification files.\n",
    "!ls -rlt $LOCAL_SPECS_DIR"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Mapping up the local directories to the TAO docker.\n",
    "import json\n",
    "import os\n",
    "mounts_file = os.path.expanduser(\"~/.tao_mounts.json\")\n",
    "\n",
    "# Define the dictionary with the mapped drives\n",
    "drive_map = {\n",
    "    \"Mounts\": [\n",
    "        # Mapping the data directory\n",
    "        {\n",
    "            \"source\": os.environ[\"LOCAL_PROJECT_DIR\"],\n",
    "            \"destination\": \"/workspace/tao-experiments\"\n",
    "        },\n",
    "        # Mapping the specs directory.\n",
    "        {\n",
    "            \"source\": os.environ[\"LOCAL_SPECS_DIR\"],\n",
    "            \"destination\": os.environ[\"SPECS_DIR\"]\n",
    "        },\n",
    "    ],\n",
    "    \"DockerOptions\":{\n",
    "        \"user\": \"{}:{}\".format(os.getuid(), os.getgid())\n",
    "    }\n",
    "}\n",
    "\n",
    "# Writing the mounts file.\n",
    "with open(mounts_file, \"w\") as mfile:\n",
    "    json.dump(drive_map, mfile, indent=4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat ~/.tao_mounts.json"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Installing the TAO launcher <a class=\"anchor\" id=\"head-1\"></a>\n",
    "The TAO launcher is a python package distributed as a python wheel listed in PyPI. You may install the launcher by executing the following cell.\n",
    "\n",
    "Please note that TAO Toolkit recommends users to run the TAO launcher in a virtual env with python 3.6.9. You may follow the instruction in this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have setup virtualenvwrapper, please set the version of python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running\n",
    "\n",
    "```sh\n",
    "export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x\n",
    "```\n",
    "where x >= 6 and <= 8\n",
    "\n",
    "We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:\n",
    "* python >=3.7, <=3.10.x\n",
    "* docker-ce > 19.03.5\n",
    "* docker-API 1.40\n",
    "* nvidia-container-toolkit > 1.3.0-1\n",
    "* nvidia-container-runtime > 3.4.0-1\n",
    "* nvidia-docker2 > 2.5.0-1\n",
    "* nvidia-driver > 455+\n",
    "\n",
    "Once you have installed the pre-requisites, please log in to the docker registry nvcr.io by following the command below\n",
    "\n",
    "```sh\n",
    "docker login nvcr.io\n",
    "```\n",
    "\n",
    "You will be triggered to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# SKIP this cell IF you have already installed the TAO launcher.\n",
    "!pip3 install nvidia-tao"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# View the versions of the TAO launcher\n",
    "!tao info"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Prepare datasets and pre-trained model <a class=\"anchor\" id=\"head-2\"></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will be using the pascal VOC dataset for the tutorial. To find more details please visit \n",
    "http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#devkit. Please download the dataset present at http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar to $DATA_DOWNLOAD_DIR."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check that file is present\n",
    "import os\n",
    "DATA_DIR = os.environ.get('LOCAL_DATA_DIR')\n",
    "print(DATA_DIR)\n",
    "if not os.path.isfile(os.path.join(DATA_DIR , 'VOCtrainval_11-May-2012.tar')):\n",
    "    print('tar file for dataset not found. Please download.')\n",
    "else:\n",
    "    print('Found dataset.')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# unpack \n",
    "!tar -xvf $LOCAL_DATA_DIR/VOCtrainval_11-May-2012.tar -C $LOCAL_DATA_DIR "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# verify\n",
    "!ls $LOCAL_DATA_DIR/VOCdevkit/VOC2012"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### A. Split the dataset into train/val/test <a class=\"anchor\" id=\"head-2-1\"></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pascal VOC Dataset is converted to our format (for classification) and then to train/val/test in the next two blocks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# install pip requirements\n",
    "!pip3 install tqdm\n",
    "!pip3 install \"matplotlib>=3.3.3, <4.0\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Prepare Pascal VOC dataset for experiment\n",
    "!python prepare_voc.py JPEGImages split"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!ls $LOCAL_DATA_DIR/split/test/cat"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### B. Download pretrained models <a class=\"anchor\" id=\"head-2-2\"></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " We will use NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click the SETUP on the navigation bar."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Installing NGC CLI on the local machine.\n",
    "## Download and install\n",
    "%env CLI=ngccli_cat_linux.zip\n",
    "!mkdir -p $LOCAL_PROJECT_DIR/ngccli\n",
    "\n",
    "# Remove any previously existing CLI installations\n",
    "!rm -rf $LOCAL_PROJECT_DIR/ngccli/*\n",
    "!wget \"https://ngc.nvidia.com/downloads/$CLI\" -P $LOCAL_PROJECT_DIR/ngccli\n",
    "!unzip -u \"$LOCAL_PROJECT_DIR/ngccli/$CLI\" -d $LOCAL_PROJECT_DIR/ngccli/\n",
    "!rm $LOCAL_PROJECT_DIR/ngccli/*.zip \n",
    "os.environ[\"PATH\"]=\"{}/ngccli/ngc-cli:{}\".format(os.getenv(\"LOCAL_PROJECT_DIR\", \"\"), os.getenv(\"PATH\", \"\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# !ngc registry model list nvidia/tao/pretrained_classification_tf2:*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# !mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_efficientnet_b0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pull pretrained model from NGC\n",
    "# !ngc registry model download-version nvidia/tao/pretrained_classification_tf2:efficientnet_b0 --dest $LOCAL_EXPERIMENT_DIR/pretrained_efficientnet_b0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Check that model is downloaded into dir.\")\n",
    "# !ls -l $LOCAL_EXPERIMENT_DIR/pretrained_efficientnet_b0/pretrained_classification_tf2_vefficientnet_b0"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Provide training specification <a class=\"anchor\" id=\"head-3\"></a>\n",
    "* Training dataset\n",
    "* Validation dataset\n",
    "* Pre-trained models\n",
    "* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!cat $LOCAL_SPECS_DIR/spec.yaml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Run TAO training <a class=\"anchor\" id=\"head-4\"></a>\n",
    "* Provide the sample spec file and the output directory location for models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p $LOCAL_EXPERIMENT_DIR/output\n",
    "!sed -i \"s|RESULTSDIR|$USER_EXPERIMENT_DIR/output|g\" $LOCAL_SPECS_DIR/spec.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!tao model classification_tf2 train -e $SPECS_DIR/spec.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"To run this training in data parallelism using multiple GPU's, please uncomment the line below and \"\n",
    "      \"update the num_gpus parameter to the number of GPU's you wish to use.\")\n",
    "# !tao model classification_tf2 train -e $SPECS_DIR/spec.yaml num_gpus=2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"To resume from a checkpoint,  just relaunch training with the same spec file.\")\n",
    "# !tao model classification_tf2 train -e $SPECS_DIR/spec.yaml num_gpus=2"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Evaluate trained models <a class=\"anchor\" id=\"head-5\"></a>\n",
    "\n",
    "In this step, we assume that the training is complete and the model from the final epoch (`efficientnet-b0_080.tlt`) is available. If you would like to run evaluation on an earlier model, please edit the spec file at `$SPECS_DIR/spec.yaml` to point to the intended model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the last checkpoints\n",
    "last_checkpoint = ''\n",
    "for f in os.listdir(os.path.join(os.environ[\"LOCAL_EXPERIMENT_DIR\"],'output', 'train')):\n",
    "    if f.startswith('efficientnet-b'):\n",
    "        last_checkpoint = last_checkpoint if last_checkpoint > f else f\n",
    "print(f'Last checkpoint: {last_checkpoint}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set LAST_CHECKPOINT in the spec file\n",
    "%env LAST_CHECKPOINT={last_checkpoint}\n",
    "!sed -i \"s|EVALMODEL|$USER_EXPERIMENT_DIR/output/train/$LAST_CHECKPOINT|g\" $LOCAL_SPECS_DIR/spec.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!tao model classification_tf2 evaluate -e $SPECS_DIR/spec.yaml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Prune trained models <a class=\"anchor\" id=\"head-6\"></a>\n",
    "* Specify pre-trained model\n",
    "* Equalization criterion\n",
    "* Threshold for pruning\n",
    "* Exclude prediction layer that you don't want pruned (e.g. predictions)\n",
    "\n",
    "Usually, you just need to adjust `prune.threshold` for accuracy and model size trade off. Higher `threshold` gives you smaller model (and thus higher inference speed) but worse accuracy. The threshold to use is depend on the dataset. 0.68 is just a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Specifying the checkpoint to be used for the pruning.\n",
    "!tao model classification_tf2 prune -e $SPECS_DIR/spec.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Pruned model:')\n",
    "print('------------')\n",
    "!ls -rlt $LOCAL_EXPERIMENT_DIR/output/prune"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7. Retrain pruned models <a class=\"anchor\" id=\"head-7\"></a>\n",
    "* Model needs to be re-trained to bring back accuracy after pruning\n",
    "* Specify re-training specification"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p $LOCAL_EXPERIMENT_DIR/output_retrain\n",
    "!sed -i \"s|RESULTSDIR|$USER_EXPERIMENT_DIR/output_retrain|g\" $LOCAL_SPECS_DIR/spec_retrain.yaml\n",
    "!sed -i \"s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/output/prune/model_th=0.68_eq=union.tlt|g\" $LOCAL_SPECS_DIR/spec_retrain.yaml\n",
    "\n",
    "!cat $LOCAL_SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tao model classification_tf2 train -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Retrain with QAT (Optional)\n",
    "!mkdir -p $LOCAL_EXPERIMENT_DIR/output_retrain_qat\n",
    "!sed -i \"s|RESULTSDIR|$USER_EXPERIMENT_DIR/output_retrain_qat|g\" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml\n",
    "!sed -i \"s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/output/prune/model_th=0.68_eq=union.tlt|g\" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml\n",
    "\n",
    "!cat $LOCAL_SPECS_DIR/spec_retrain_qat.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tao model classification_tf2 train -e $SPECS_DIR/spec_retrain_qat.yaml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 8. Testing the model! <a class=\"anchor\" id=\"head-8\"></a>\n",
    "\n",
    "In this step, we assume that the training is complete and the model from the final epoch (`efficientnet-b0_080.tlt`) is available. If you would like to run evaluation on an earlier model, please edit the spec file at `$SPECS_DIR/spec_retrain.yaml` to point to the intended model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the last checkpoints\n",
    "last_checkpoint = ''\n",
    "for f in os.listdir(os.path.join(os.environ[\"LOCAL_EXPERIMENT_DIR\"], 'output_retrain', 'train')):\n",
    "    if f.startswith('efficientnet-b'):\n",
    "        last_checkpoint = last_checkpoint if last_checkpoint > f else f\n",
    "print(f'Last checkpoint: {last_checkpoint}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set LAST_CHECKPOINT in the spec file\n",
    "%env LAST_CHECKPOINT={last_checkpoint}\n",
    "!sed -i \"s|EVALMODEL|$USER_EXPERIMENT_DIR/output_retrain/train/$LAST_CHECKPOINT|g\" $LOCAL_SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tao model classification_tf2 evaluate -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# evaluate with the QAT model\n",
    "# get the last checkpoints\n",
    "last_checkpoint = ''\n",
    "for f in os.listdir(os.path.join(os.environ[\"LOCAL_EXPERIMENT_DIR\"],'output_retrain_qat', 'train')):\n",
    "    if f.startswith('efficientnet-b'):\n",
    "        last_checkpoint = last_checkpoint if last_checkpoint > f else f\n",
    "print(f'Last checkpoint: {last_checkpoint}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set LAST_CHECKPOINT in the spec file\n",
    "%env LAST_CHECKPOINT={last_checkpoint}\n",
    "!sed -i \"s|EVALMODEL|$USER_EXPERIMENT_DIR/output_retrain_qat/train/$LAST_CHECKPOINT|g\" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tao model classification_tf2 evaluate -e $SPECS_DIR/spec_retrain_qat.yaml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 9. Visualize Inferences <a class=\"anchor\" id=\"head-9\"></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see the output results of our model on test images, we can use the `tao inference` tool. Note that using models trained for higher epochs will usually result in better results. We'll run inference with the directory mode. You can also use the single image mode."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tao model classification_tf2 inference -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat $LOCAL_EXPERIMENT_DIR/output_retrain/inference/result.csv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run inference with the QAT model\n",
    "!tao model classification_tf2 inference -e $SPECS_DIR/spec_retrain_qat.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat $LOCAL_EXPERIMENT_DIR/output_retrain_qat/inference/result.csv"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 10. Export and Deploy! <a class=\"anchor\" id=\"head-10\"></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p $LOCAL_EXPERIMENT_DIR/export\n",
    "\n",
    "!sed -i \"s|EXPORTDIR|$USER_EXPERIMENT_DIR/export|g\" $LOCAL_SPECS_DIR/spec_retrain.yaml\n",
    "!tao model classification_tf2 export -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check if onnx model is correctly saved.\n",
    "!ls -l $LOCAL_EXPERIMENT_DIR/export"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the `tao deploy` container, you can generate a TensorRT engine and verify the correctness of the generated through evaluate and inference.\n",
    "\n",
    "The `tao deploy` produces optimized tensorrt engines for the platform that it resides on. Therefore, to get maximum performance, please run `tao deploy` command which will instantiate a deploy container, with the exported `.onnx` file on your target device. The `tao deploy` container only works for x86, with discrete NVIDIA GPU's.\n",
    "\n",
    "For the jetson devices, please download the tao-converter for jetson and refer to [here](https://docs.nvidia.com/tao/tao-toolkit-archive/tao-30-2108/text/tensorrt.html#installing-the-tao-converter) for more details.\n",
    "\n",
    "If you choose to integrate your model into deepstream directly, you may do so by simply copying the exported `.onnx` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to this newly exported model. Usually this file is called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert to TensorRT engine (FP32).\n",
    "!tao deploy classification_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert to TensorRT engine (INT8).\n",
    "!sed -i \"s|fp32|int8|g\" $LOCAL_SPECS_DIR/spec_retrain.yaml\n",
    "!tao deploy classification_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Exported model:')\n",
    "print('------------')\n",
    "!ls -lh $LOCAL_EXPERIMENT_DIR/export/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert QAT model to TensorRT engine\n",
    "!mkdir -p $LOCAL_EXPERIMENT_DIR/export_qat\n",
    "!sed -i \"s|EXPORTDIR|$USER_EXPERIMENT_DIR/export_qat|g\" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml\n",
    "!tao model classification_tf2 export -e $SPECS_DIR/spec_retrain_qat.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Exported QAT model:')\n",
    "print('------------')\n",
    "!ls -lh $LOCAL_EXPERIMENT_DIR/export_qat/"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 11. Verify the deployed model <a class=\"anchor\" id=\"head-11\"></a>\n",
    "\n",
    "Verify the converted engine by visualizing TensorRT inferences."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!sed -i \"s|batch_size: 256|batch_size: 16|g\" $LOCAL_SPECS_DIR/spec_retrain.yaml\n",
    "# Running inference \n",
    "!tao deploy classification_tf2 inference -e $SPECS_DIR/spec_retrain.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat $LOCAL_EXPERIMENT_DIR/output_retrain/inference/result.csv"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10 (default, Mar 15 2022, 12:22:08) \n[GCC 9.4.0]"
  },
  "vscode": {
   "interpreter": {
    "hash": "5b3ded1ccb95c1d9bd405e7b823d9e85424cde40fbb5985eb47e999ef50e15b4"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
